自从17世纪以来,理论上就建立了非语言交流的\ Esquote*{Language}的手势。但是,它与视觉艺术的相关性仅偶尔表达。这可能主要是由于传统上必须手工处理的大量数据。但是,随着数字化的稳定进展,越来越多的历史文物被索引并提供给公众,从而需要自动检索具有类似身体星座或姿势的艺术历史图案。由于艺术领域因其风格差异而与现有的人类姿势估计的现实世界数据集有很大不同,因此提出了新的挑战。在本文中,我们提出了一种新颖的方法来估计艺术历史图像中的人类姿势。与以前试图用预训练模型或通过样式转移弥合域间隙的工作相反,我们建议对对象和关键点检测进行半监督学习。此外,我们引入了一个新颖的特定领域艺术数据集,其中包括人物的边界框和关键点注释。与使用预训练模型或样式转移的方法相比,我们的方法取得了明显更好的结果。
translated by 谷歌翻译
This short report reviews the current state of the research and methodology on theoretical and practical aspects of Artificial Neural Networks (ANN). It was prepared to gather state-of-the-art knowledge needed to construct complex, hypercomplex and fuzzy neural networks. The report reflects the individual interests of the authors and, by now means, cannot be treated as a comprehensive review of the ANN discipline. Considering the fast development of this field, it is currently impossible to do a detailed review of a considerable number of pages. The report is an outcome of the Project 'The Strategic Research Partnership for the mathematical aspects of complex, hypercomplex and fuzzy neural networks' meeting at the University of Warmia and Mazury in Olsztyn, Poland, organized in September 2022.
translated by 谷歌翻译
Distributionally robust optimization (DRO) can improve the robustness and fairness of learning methods. In this paper, we devise stochastic algorithms for a class of DRO problems including group DRO, subpopulation fairness, and empirical conditional value at risk (CVaR) optimization. Our new algorithms achieve faster convergence rates than existing algorithms for multiple DRO settings. We also provide a new information-theoretic lower bound that implies our bounds are tight for group DRO. Empirically, too, our algorithms outperform known methods
translated by 谷歌翻译
Profile extrusion is a continuous production process for manufacturing plastic profiles from molten polymer. Especially interesting is the design of the die, through which the melt is pressed to attain the desired shape. However, due to an inhomogeneous velocity distribution at the die exit or residual stresses inside the extrudate, the final shape of the manufactured part often deviates from the desired one. To avoid these deviations, the shape of the die can be computationally optimized, which has already been investigated in the literature using classical optimization approaches. A new approach in the field of shape optimization is the utilization of Reinforcement Learning (RL) as a learning-based optimization algorithm. RL is based on trial-and-error interactions of an agent with an environment. For each action, the agent is rewarded and informed about the subsequent state of the environment. While not necessarily superior to classical, e.g., gradient-based or evolutionary, optimization algorithms for one single problem, RL techniques are expected to perform especially well when similar optimization tasks are repeated since the agent learns a more general strategy for generating optimal shapes instead of concentrating on just one single problem. In this work, we investigate this approach by applying it to two 2D test cases. The flow-channel geometry can be modified by the RL agent using so-called Free-Form Deformation, a method where the computational mesh is embedded into a transformation spline, which is then manipulated based on the control-point positions. In particular, we investigate the impact of utilizing different agents on the training progress and the potential of wall time saving by utilizing multiple environments during training.
translated by 谷歌翻译
Many real-world reinforcement learning tasks require control of complex dynamical systems that involve both costly data acquisition processes and large state spaces. In cases where the transition dynamics can be readily evaluated at specified states (e.g., via a simulator), agents can operate in what is often referred to as planning with a \emph{generative model}. We propose the AE-LSVI algorithm for best-policy identification, a novel variant of the kernelized least-squares value iteration (LSVI) algorithm that combines optimism with pessimism for active exploration (AE). AE-LSVI provably identifies a near-optimal policy \emph{uniformly} over an entire state space and achieves polynomial sample complexity guarantees that are independent of the number of states. When specialized to the recently introduced offline contextual Bayesian optimization setting, our algorithm achieves improved sample complexity bounds. Experimentally, we demonstrate that AE-LSVI outperforms other RL algorithms in a variety of environments when robustness to the initial state is required.
translated by 谷歌翻译
In this paper, we present strong baselines for the task of Feedback Comment Generation for Writing Learning. Given a sentence and an error span, the task is to generate a feedback comment explaining the error. Sentences and feedback comments are both in English. We experiment with LLMs and also create multiple pseudo datasets for the task, investigating how it affects the performance of our system. We present our results for the task along with extensive analysis of the generated comments with the aim of aiding future studies in feedback comment generation for English language learners.
translated by 谷歌翻译
Foundation models can be disruptive for future AI development by scaling up deep learning in terms of model size and training data's breadth and size. These models achieve state-of-the-art performance (often through further adaptation) on a variety of tasks in domains such as natural language processing and computer vision. Foundational models exhibit a novel {emergent behavior}: {In-context learning} enables users to provide a query and a few examples from which a model derives an answer without being trained on such queries. Additionally, {homogenization} of models might replace a myriad of task-specific models with fewer very large models controlled by few corporations leading to a shift in power and control over AI. This paper provides a short introduction to foundation models. It contributes by crafting a crisp distinction between foundation models and prior deep learning models, providing a history of machine learning leading to foundation models, elaborating more on socio-technical aspects, i.e., organizational issues and end-user interaction, and a discussion of future research.
translated by 谷歌翻译
Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from https://console.cloud.google.com/storage/browser/isi-simsurgskill-2021.
translated by 谷歌翻译
Incorporating computed tomography (CT) reconstruction operators into differentiable pipelines has proven beneficial in many applications. Such approaches usually focus on the projection data and keep the acquisition geometry fixed. However, precise knowledge of the acquisition geometry is essential for high quality reconstruction results. In this paper, the differentiable formulation of fan-beam CT reconstruction is extended to the acquisition geometry. This allows to propagate gradient information from a loss function on the reconstructed image into the geometry parameters. As a proof-of-concept experiment, this idea is applied to rigid motion compensation. The cost function is parameterized by a trained neural network which regresses an image quality metric from the motion affected reconstruction alone. Using the proposed method, we are the first to optimize such an autofocus-inspired algorithm based on analytical gradients. The algorithm achieves a reduction in MSE by 35.5 % and an improvement in SSIM by 12.6 % over the motion affected reconstruction. Next to motion compensation, we see further use cases of our differentiable method for scanner calibration or hybrid techniques employing deep models.
translated by 谷歌翻译
Temporal data like time series are often observed at irregular intervals which is a challenging setting for existing machine learning methods. To tackle this problem, we view such data as samples from some underlying continuous function. We then define a diffusion-based generative model that adds noise from a predefined stochastic process while preserving the continuity of the resulting underlying function. A neural network is trained to reverse this process which allows us to sample new realizations from the learned distribution. We define suitable stochastic processes as noise sources and introduce novel denoising and score-matching models on processes. Further, we show how to apply this approach to the multivariate probabilistic forecasting and imputation tasks. Through our extensive experiments, we demonstrate that our method outperforms previous models on synthetic and real-world datasets.
translated by 谷歌翻译